A Comparative Study on Operational Database, Data Warehouse and Hadoop File System

Authors

  • T. Jalaja
  • M. Shailaja
Abstract

A computer database is a collection of logically related data stored in a computer system so that a program, or a person using a query language, can retrieve answers to queries from it. An operational database (OLTP) contains up-to-date, modifiable, application-specific data. A data warehouse (OLAP) is a subject-oriented, integrated, time-variant and non-volatile collection of data used to support business decisions. The Hadoop Distributed File System (HDFS) allows large amounts of data to be stored on a cloud of machines. In this paper, we survey the literature on operational databases, data warehouses and Hadoop technology.
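
As a concrete illustration of the HDFS storage model mentioned above, the sketch below writes a small file through Hadoop's Java `FileSystem` client API; HDFS transparently splits the file into blocks and replicates them across the cluster. The NameNode address and target path are placeholders, not values from the paper.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal sketch: write a file into HDFS through the client API.
// The NameNode URI and target path are hypothetical placeholders.
public class HdfsWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the cluster's NameNode (placeholder address).
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");

        FileSystem fs = FileSystem.get(conf);
        Path target = new Path("/warehouse/archive/sample.txt");

        // The file is split into blocks and replicated across DataNodes
        // by HDFS itself; the client only sees a single output stream.
        try (FSDataOutputStream out = fs.create(target)) {
            out.writeBytes("example record\n");
        }
        fs.close();
    }
}
```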


Similar articles

ETL Workflow Generation for Offloading Dormant Data from the Data Warehouse to Hadoop

The technologies developed to address the needs of Big Data have presented a vast number of beneficial opportunities for use alongside the traditional Data Warehouse (DW). There are several proposed use cases for using Apache Hadoop as a complement to traditional DWs as a Big Data platform. One of these use cases is the offloading of "dormant data", that is, infrequently used or inactive data, from the DW to Hadoop.
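
A minimal, hand-written sketch of the offloading idea (not the ETL workflows the paper generates): rows that have not been accessed since a cutoff date are read from a relational warehouse table over JDBC and appended to a flat file on HDFS. The connection string, table name and column names are hypothetical, and a suitable JDBC driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of dormant-data offloading: copy rows not touched since a cutoff
// date from a warehouse table into a flat file on HDFS. All names are
// hypothetical.
public class DormantDataOffload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        FileSystem fs = FileSystem.get(conf);

        try (Connection dw = DriverManager.getConnection(
                 "jdbc:postgresql://dw.example.com/warehouse", "etl", "secret");
             PreparedStatement stmt = dw.prepareStatement(
                 "SELECT id, payload FROM sales WHERE last_access < ?");
             FSDataOutputStream out =
                 fs.create(new Path("/archive/sales_dormant.csv"))) {

            stmt.setDate(1, java.sql.Date.valueOf("2013-01-01"));
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    // Write one CSV line per dormant row.
                    out.writeBytes(rs.getLong("id") + ","
                                   + rs.getString("payload") + "\n");
                }
            }
        }
        fs.close();
    }
}
```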


The Cooperative Study Between the Hadoop Big Data Platform and the Traditional Data Warehouse

Based on the application conditions of the existing traditional data warehouse and the future outlook for the Hadoop big data platform, this paper proposes a new framework for cooperation between Hadoop and the traditional data warehouse, focusing on how the two can work together to solve the problem that the traditional data wareho...


Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming

The objective of this study is to verify the importance of cloud computing capabilities for managing and analyzing big data in business organizations, because the rapid development in the use of information technology in general, and network technology in particular, has led many organizations to make their applications available for use via electronic platforms hos...
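
The canonical word-count job below is a minimal sketch of the MapReduce model this study examines: mappers emit (word, 1) pairs in parallel across the cluster, and reducers sum the counts per word. It follows the standard Hadoop MapReduce API and is not code from the study itself.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Canonical word-count job: mappers emit (word, 1) pairs in parallel,
// reducers sum the counts per word.
public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);  // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```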


Data Availability and Durability with the Hadoop Distributed File System

The Hadoop Distributed File System at Yahoo! stores 40 petabytes of application data across 30,000 nodes. The most conventional strategy for data protection, simply making a copy somewhere else, is not practical for such large data sets. To be a good custodian of this much data, HDFS must continuously manage the number of replicas for each block, test the integrity of blocks, balance the usage of r...
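
Replica management is also exposed to clients. The sketch below, using the standard `FileSystem` API, reads a file's current replica count and asks the NameNode to keep three copies of each of its blocks; HDFS then creates or removes block copies to match the target. The file path and cluster address are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: inspect and change the replica count of one HDFS file.
// Path and NameNode address are hypothetical placeholders.
public class ReplicationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/archive/sales_dormant.csv");
        FileStatus status = fs.getFileStatus(file);
        System.out.println("current replicas: " + status.getReplication());

        // Ask the NameNode to keep three copies of every block of this file.
        fs.setReplication(file, (short) 3);
        fs.close();
    }
}
```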


Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

The Hadoop MapReduce framework is an important distributed processing model for large-scale, data-intensive applications. The current Hadoop implementation, and the existing HDFS rack-aware data placement strategy used by MapReduce, assume a homogeneous cluster in which every node has the same computing capacity and is assigned the same workload. Default Hadoop d...
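
As a toy illustration of the general idea, and not the paper's algorithm, the sketch below divides a fixed number of data blocks among nodes in proportion to assumed per-node computing capacities, rather than uniformly as the default placement does. All node names and capacity figures are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy illustration (not the paper's algorithm): divide a fixed number of
// data blocks among nodes in proportion to assumed relative computing
// capacities, instead of a uniform split.
public class ProportionalPlacement {
    public static void main(String[] args) {
        Map<String, Double> capacity = new LinkedHashMap<>();
        capacity.put("node-fast", 4.0);  // hypothetical relative capacities
        capacity.put("node-mid",  2.0);
        capacity.put("node-slow", 1.0);

        int totalBlocks = 700;
        double totalCapacity = capacity.values().stream()
                .mapToDouble(Double::doubleValue).sum();

        // A faster node receives proportionally more blocks, so all nodes
        // finish their local work in roughly the same time.
        for (Map.Entry<String, Double> e : capacity.entrySet()) {
            long blocks = Math.round(totalBlocks * e.getValue() / totalCapacity);
            System.out.println(e.getKey() + " -> " + blocks + " blocks");
        }
    }
}
```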




Publication date: 2015